MEPS_IBTS_MLEbins

Andrew Edwards

Analyses of IBTS data using the MLEbins method

This vignette analyses the IBTS data using the MLEbins method that explicitly accounts for the species-specific body-mass bins.

It creates Figure 6 (and the related Figures S.1, S.2 and S.3), showing the species-specific body-mass bins resulting from the length bins, Figure 8 (a comparison of MLE and MLEbins values of b through time), and the MLEbins row of Table S.1.

library(sizeSpectra)
library(tibble)  # Else prints all of a tibble
data = IBTS_data
data
#> # A tibble: 42,298 x 7
#>     Year SpecCode LngtClass  Number    LWa   LWb bodyMass
#>    <int>    <int>     <dbl>   <dbl>  <dbl> <dbl>    <dbl>
#>  1  1986   105814        45 0.00714 0.0031  3.03     315.
#>  2  1986   105814        46 0.00714 0.0031  3.03     337.
#>  3  1986   105814        50 0.00714 0.0031  3.03     434.
#>  4  1986   105814        52 0.0293  0.0031  3.03     489.
#>  5  1986   105814        53 0.0109  0.0031  3.03     518.
#>  6  1986   105814        54 0.0113  0.0031  3.03     548.
#>  7  1986   105814        56 0.0218  0.0031  3.03     612.
#>  8  1986   105814        57 0.0188  0.0031  3.03     646.
#>  9  1986   105814        58 0.0381  0.0031  3.03     680.
#> 10  1986   105814        59 0.0327  0.0031  3.03     717.
#> # ... with 42,288 more rows

Determining which rows are 0.5 cm bins

LngtClass for all species is the minimum value of a 1-cm-width bin, except for herring (Clupea harengus) and sprat (Sprattus sprattus) for which lengths are rounded down to 0.5 cm values (so the bins are 0.5-cm wide). The SpecCode values for these are:

herringCode = dplyr::filter(specCodeNames, species == "Clupea harengus")$speccode
herringCode
#> [1] 126417
spratCode = dplyr::filter(specCodeNames, species == "Sprattus sprattus")$speccode
spratCode
#> [1] 126425
specCode05 = c(herringCode, spratCode)      # species codes with 0.5cm length bins

Verified earlier that only these two species have 0.5 cm values for LngtClass.
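A check of this kind can be sketched as follows. This is only an illustration on made-up data (the species codes and lengths in toy are hypothetical, not IBTS values), and is not necessarily how the original verification was done:

```r
# Toy data: hypothetical species codes and length classes (cm), not IBTS values
toy <- data.frame(SpecCode  = c(1, 1, 2, 2, 3, 3),
                  LngtClass = c(10, 11, 12.5, 13, 20, 21.5))

# A length class is a half-cm value if it is not a whole number
isHalf <- toy$LngtClass %% 1 != 0

# Species codes that have at least one half-cm length class
halfCmCodes <- sort(unique(toy$SpecCode[isHalf]))
halfCmCodes
```

Applied to the real data, the resulting codes could then be compared with specCode05.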

Append the max of the bin breaks for each row

LngtClass is the minimum of each length bin. We need to work out the maximum of each length bin, LngtMax, and then use the species-specific length-weight relationships to give the minimum (wmin) and maximum (wmax) of each body-mass bin. So create a data frame dataBin that has LngtMax and wmax as extra columns for each row, with LngtClass renamed to LngtMin and bodyMass renamed to wmin:

dataBin = dplyr::mutate(data,
                        LngtMax = LngtClass + 1)
aa = which(dataBin$SpecCode %in% specCode05)           # row numbers for herring, sprat
dataBin[aa, "LngtMax"] = dataBin[aa, "LngtMax"] - 0.5  # subtract 0.5 cm to
                                                       # give 0.5-cm wide bins
unique(dataBin$LngtMax - dataBin$LngtClass)            # correctly just has 0.5 and 1
#> [1] 1.0 0.5
unique( dplyr::filter(dataBin, LngtMax - LngtClass == 0.5)$SpecCode)  # just herring,sprat
#> [1] 126417 126425

dataBin = dplyr::mutate(dataBin, wmax = LWa * LngtMax^LWb)  # calculate max body mass
                                                            # for each bin (min
                                                            # is currently bodyMass)
dataBin = dplyr::rename(dataBin, LngtMin = LngtClass)       # For consistency
dataBin = dplyr::rename(dataBin, wmin = bodyMass)

dataBin = dataBin[ , c("Year", "SpecCode", "LngtMin", "LngtMax",
                       "LWa", "LWb", "wmin", "wmax", "Number")]     # Reorder columns

range(dplyr::mutate(dataBin,
                    wminCheck = LWa * LngtMin^LWb)$wminCheck - dataBin$wmin)
#> [1] -2.273737e-13  2.273737e-13
                                              # Verifying that wmin is correct
                                              # (was calculated independently)
length(unique(dataBin$SpecCode))
#> [1] 135
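In equation form, the conversion coded above from a length bin to a body-mass bin is as follows, using the column names as symbols (the notation is only a restatement of the code, with L standing for length):

```latex
\begin{aligned}
L_{\max} &=
  \begin{cases}
    L_{\min} + 0.5 & \text{for herring and sprat,} \\
    L_{\min} + 1   & \text{for all other species,}
  \end{cases} \\
w_{\min} &= \mathrm{LWa} \, L_{\min}^{\mathrm{LWb}}, \\
w_{\max} &= \mathrm{LWa} \, L_{\max}^{\mathrm{LWb}}.
\end{aligned}
```

Here LWa and LWb are the species-specific length-weight coefficients, so the same length bin gives a different body-mass bin for each species.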

This is the code to then save dataBin as a data set in the package, but is not run here:

usethis::use_data(dataBin, overwrite = TRUE)

Plot the resulting body mass bins

So there are 135 unique species. Now going to plot the resulting body mass bins for each species, with 45 on each figure. This gives Figures 6, S.1, S.2 and S.3. This function wrangles the data, calculates some useful values and plots all four figures:

res <- species_bins_plots()

Those four figures show how the length bins for each species get converted to body mass bins. The conversions are different for each species because of the different values of the length-weight coefficients. Even with 1-cm length bins (and 0.5-cm for herring and sprat) the resulting body-mass bins can span a large range. See paper for further details.
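To see why fixed-width length bins give increasingly wide body-mass bins, here is a minimal sketch using hypothetical length-weight coefficients (LWa = 0.01 and LWb = 3 are chosen for easy arithmetic and are not taken from the data):

```r
# Hypothetical length-weight coefficients, chosen for easy arithmetic
LWa <- 0.01
LWb <- 3

LngtMin <- c(10, 50, 100)     # lower ends of three 1-cm length bins
LngtMax <- LngtMin + 1        # upper ends

wmin <- LWa * LngtMin^LWb     # body mass at lower end of each bin
wmax <- LWa * LngtMax^LWb     # body mass at upper end of each bin
binWidth <- wmax - wmin       # width of each body-mass bin

data.frame(LngtMin, LngtMax, wmin, wmax, binWidth)
```

With these assumed coefficients, the 1-cm bin starting at 10 cm is 3.31 g wide, while the 1-cm bin starting at 100 cm is 303.01 g wide.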

Two species are highlighted: Triglops murrayi is Moustache Sculpin (code 127205). Lumpenus lampretaeformis is Snakeblenny (code 154675). Data for these are:

dataHighlight = dplyr::filter(data,
                              SpecCode %in% c(127205, 154675))
dataHighlightSumm = dplyr::summarise(dplyr::group_by(dataHighlight,
                                                     SpecCode),
                                     minLngt = min(LngtClass),
                                     maxLngt = max(LngtClass),
                                     LWa = unique(LWa),
                                     LWb = unique(LWb))
dataHighlightSumm
#> # A tibble: 2 x 5
#>   SpecCode minLngt maxLngt    LWa   LWb
#>      <int>   <dbl>   <dbl>  <dbl> <dbl>
#> 1   127205       8      15 0.0088  3   
#> 2   154675      13      37 0.0244  2.04

The widest resulting body-mass bin occurs for Atlantic Cod (Gadus morhua), which is the rightmost species in the final figure above. The widest bin has a width of 832 g.

Likelihood calculations using MLEbins method

Now use the MLEbins method to fit each year of data in turn.

fullYears = sort(unique(dataBin$Year))
# Do a loop for each year, saving all the results in MLEbins.nSeaFung.new
for(iii in seq_along(fullYears))
  {
    dataBinForLike = dplyr::filter(dataBin,
                                   Year == fullYears[iii])
    dataBinForLike = dplyr::select(dataBinForLike,
                                   SpecCode,
                                   wmin,
                                   wmax,
                                   Number)
    n = sum(dataBinForLike$Number)
    xmin = min(dataBinForLike$wmin)
    xmax = max(dataBinForLike$wmax)

    MLEbins.nSeaFung.oneyear.new  = calcLike(negLL.fn = negLL.PLB.bins.species,
                                             p = -1.9,
                                             suppress.warnings = TRUE,
                                             dataBinForLike = dataBinForLike,
                                             n = n,
                                             xmin = xmin,
                                             xmax = xmax)

    if(iii == 1)
    {
      MLEbins.nSeaFung.new = data.frame(Year = fullYears[iii],
                                        xmin=xmin,
                                        xmax=xmax,
                                        n=n,
                                        b=MLEbins.nSeaFung.oneyear.new$MLE,
                                        confMin=MLEbins.nSeaFung.oneyear.new$conf[1],
                                        confMax=MLEbins.nSeaFung.oneyear.new$conf[2])
    } else {
      MLEbins.nSeaFung.new = rbind(MLEbins.nSeaFung.new,
                                   c(fullYears[iii],
                                     xmin,
                                     xmax,
                                     n,
                                     MLEbins.nSeaFung.oneyear.new$MLE,
                                     MLEbins.nSeaFung.oneyear.new$conf[1],
                                     MLEbins.nSeaFung.oneyear.new$conf[2]))
   }
}
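The idea behind fitting binned data with species-specific bins can be sketched in base R. This is a simplified illustration, not the package's negLL.PLB.bins.species or calcLike: the function negLL_bins_sketch and the data in toyBins are hypothetical. It assumes a bounded power-law distribution with exponent b on [xmin, xmax], so that the probability of a body mass falling in a bin [wmin, wmax] is (wmax^(b+1) - wmin^(b+1)) / (xmax^(b+1) - xmin^(b+1)) for b not equal to -1:

```r
# Negative log-likelihood for binned data under a bounded power law.
# Each element of wmin, wmax and counts describes one bin; bins of different
# species may overlap. Hypothetical helper, not the package function.
negLL_bins_sketch <- function(b, wmin, wmax, counts, xmin, xmax) {
  if (b == -1) {
    binProb <- (log(wmax) - log(wmin)) / (log(xmax) - log(xmin))
  } else {
    binProb <- (wmax^(b + 1) - wmin^(b + 1)) / (xmax^(b + 1) - xmin^(b + 1))
  }
  -sum(counts * log(binProb))
}

# Toy data: two hypothetical species with different body-mass bins
toyBins <- data.frame(SpecCode = c(1, 1, 1, 2, 2),
                      wmin     = c(1, 2, 4, 1, 3),
                      wmax     = c(2, 4, 8, 3, 9),
                      Number   = c(100, 50, 25, 60, 20))
xmin <- min(toyBins$wmin)
xmax <- max(toyBins$wmax)

# One-dimensional minimisation over b
fit <- optimize(negLL_bins_sketch,
                interval = c(-4, -1.1),
                wmin = toyBins$wmin,
                wmax = toyBins$wmax,
                counts = toyBins$Number,
                xmin = xmin,
                xmax = xmax)
bHat <- fit$minimum     # estimate of b for the toy data
bHat
```

The sketch only returns a point estimate; the confidence intervals in the loop above come from calcLike.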

# Need the standard error for weighted linear regression,
#  see eightMethods.count() for details:
MLEbins.nSeaFung.new = tibble::as_tibble(MLEbins.nSeaFung.new)  # dplyr::tbl_df() is deprecated
MLEbins.nSeaFung.new = dplyr::mutate(MLEbins.nSeaFung.new,
                                     stdErr = (abs(confMin-b) +
                                               abs(confMax-b))/(2*1.96) )
MLEbins.nSeaFung.new
#> # A tibble: 30 x 8
#>     Year  xmin   xmax     n     b confMin confMax  stdErr
#>    <dbl> <dbl>  <dbl> <dbl> <dbl>   <dbl>   <dbl>   <dbl>
#>  1  1986  4.05 28723. 4799. -1.49   -1.50   -1.47 0.00791
#>  2  1987  4.16 25974. 6203. -1.53   -1.54   -1.51 0.00714
#>  3  1988  4.06 29440. 7712. -1.60   -1.62   -1.59 0.00714
#>  4  1989  4.06 35211. 7668. -1.55   -1.56   -1.54 0.00663
#>  5  1990  4.05 34811. 6400. -1.52   -1.53   -1.50 0.00714
#>  6  1991  4.06 26413. 6640. -1.49   -1.50   -1.48 0.00663
#>  7  1992  4.05 25317. 7973. -1.53   -1.54   -1.52 0.00663
#>  8  1993  4.16 29440. 8628. -1.49   -1.50   -1.48 0.00587
#>  9  1994  4.16 22201. 6283. -1.52   -1.53   -1.50 0.00714
#> 10  1995  4.16 31819. 9402. -1.62   -1.64   -1.61 0.00663
#> # ... with 20 more rows
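The stdErr column calculated above corresponds to the following expression, which averages the two half-widths of the 95% confidence interval and divides by 1.96 (the 97.5% quantile of the standard normal distribution). The symbols are just the column names used in the code:

```latex
\mathrm{stdErr}
  = \frac{\lvert \mathrm{confMin} - b \rvert + \lvert \mathrm{confMax} - b \rvert}
         {2 \times 1.96}
```

This treats the confidence interval as approximately that of a normal distribution centred on b.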

Now to plot the results and obtain the regression fit:

res = timeSerPlot(MLEbins.nSeaFung.new,
                  legName = "(a) MLEbins",
                  yLim = c(-2.2, -0.9),
                  xLab = "Year",
                  method = "MLEbins",
                  legPos = "bottomleft",
                  weightReg = TRUE,
                  xTicksSmallInc = 1,
                  yTicksSmallInc = 0.05)
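For intuition, a weighted linear regression of b against Year can be sketched with lm(). This is a sketch only: it uses the rounded values from the first ten rows printed above, and it assumes that the weights are the inverse variances 1/stdErr^2 (see eightMethods.count() and timeSerPlot() for what the package actually does):

```r
# First ten rows of the results printed above (rounded values)
toyRes <- data.frame(
  Year   = 1986:1995,
  b      = c(-1.49, -1.53, -1.60, -1.55, -1.52,
             -1.49, -1.53, -1.49, -1.52, -1.62),
  stdErr = c(0.00791, 0.00714, 0.00714, 0.00663, 0.00714,
             0.00663, 0.00663, 0.00587, 0.00714, 0.00663))

# Assumption: weight each year by the inverse of its variance
fit <- lm(b ~ Year, data = toyRes, weights = 1 / stdErr^2)

trend <- unname(coef(fit)["Year"])      # estimated slope (change in b per year)
ci <- unname(confint(fit)["Year", ])    # 95% confidence interval for the slope
c(Low = ci[1], Trend = trend, High = ci[2])
```

Because only ten rounded rows are used, the numbers will not match the full 30-year results.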

The statistics for the regression fit, which form the final row in Table S.1, are:

trendResultsMLEbinsNew = tibble::as_tibble(res)   # dplyr::tbl_df() is deprecated
knitr::kable(dplyr::select(trendResultsMLEbinsNew, Method, Low, Trend, High, p, Rsquared),
             digits=c(NA, 4, 4, 4, 2, 2))
|Method  |     Low|  Trend|   High|    p| Rsquared|
|:-------|-------:|------:|------:|----:|--------:|
|MLEbins | -0.0043| -0.001| 0.0024| 0.56|     0.01|

And use the results to plot Figure 8, comparing results from the original MLE method with those from the MLEbins method.

fullResults.MLEbins = MLEbins.nSeaFung.new  # Should really have just used
                                        # MLEbins..; happened to include nSeaFung early on
trend.MLEbins.new = dplyr::filter(trendResultsMLEbinsNew,
                                  Method == "MLEbins")
fullResults.MLE = dplyr::filter(fullResults, Method == "MLE")

bYears = fullResults.MLE$Year
MLE.col = "blue"
MLEbins.col = "red"
# postscript("nSeaFungCompareTrendsCol.eps", height = 6.3,
#            width = 7.5,
#            horizontal=FALSE,  paper="special")
res.MLE = timeSerPlot(fullResults.MLE,
                      legName = "",
                      xLim = range(bYears),
                      yLim = c(-1.82, -1.35),
                      xLab = "Year",
                      method = "MLE",
                      legPos = "bottomleft",
                      weightReg = TRUE,
                      bCol = MLE.col,
                      confCol = MLE.col,
                      pchVal = 19,
                      regPlot = FALSE,
                      regColNotSig = "lightblue",
                      regColSig = "darkblue",
                      xTicksSmallInc = 1,
                      yTicksSmallInc = 0.02,
                      legExtra = c("MLEbins", "MLE"),
                      legExtraCol = c(MLEbins.col, MLE.col),
                      legExtraPos = "topleft",
                      xJitter = -0.03)       # MLEbins on top as values are higher in figure

res.MLEbins.new = timeSerPlot(fullResults.MLEbins,
                              legName = "",
                              method = "MLEbins",
                              weightReg = TRUE,
                              newPlot = FALSE,
                              bCol = MLEbins.col,
                              confCol = MLEbins.col,
                              pchVal = 19,
                              regPlot = FALSE,
                              regColNotSig = "pink",
                              regColSig = "darkred",
                              xJitter = 0.03)

# dev.off()

For Table S.2 (results for each year for the MLEbins method), need the constant C for each year, so calculate it here:

MLEbins.res = MLEbins.nSeaFung.new
MLEbins.res = dplyr::mutate(MLEbins.res,
                            # ifelse() avoids the NaN that 0 * (0/0) would
                            # give if b were exactly -1
                            C = ifelse(b == -1,
                                       1 / ( log(xmax) - log(xmin) ),
                                       (b+1) / ( xmax^(b+1) - xmin^(b+1) ))
                           )
MLEbins.res = dplyr::select(MLEbins.res, -stdErr)
knitr::kable(dplyr::select(MLEbins.res, Year, xmin, xmax, n, confMin, b,
                           confMax, C),
             digits=c(0, rep(2, 7)))
| Year| xmin|     xmax|        n| confMin|     b| confMax|    C|
|----:|----:|--------:|--------:|-------:|-----:|-------:|----:|
| 1986| 4.05| 28722.51|  4799.29|   -1.50| -1.49|   -1.47| 0.97|
| 1987| 4.16| 25974.17|  6202.85|   -1.54| -1.53|   -1.51| 1.12|
| 1988| 4.06| 29439.75|  7711.72|   -1.62| -1.60|   -1.59| 1.41|
| 1989| 4.06| 35210.99|  7667.87|   -1.56| -1.55|   -1.54| 1.20|
| 1990| 4.05| 34811.19|  6399.75|   -1.53| -1.52|   -1.50| 1.07|
| 1991| 4.06| 26412.52|  6639.58|   -1.50| -1.49|   -1.48| 0.99|
| 1992| 4.05| 25316.66|  7973.32|   -1.54| -1.53|   -1.52| 1.12|
| 1993| 4.16| 29439.75|  8628.05|   -1.50| -1.49|   -1.48| 1.00|
| 1994| 4.16| 22200.90|  6282.84|   -1.53| -1.52|   -1.50| 1.10|
| 1995| 4.16| 31818.64|  9401.90|   -1.64| -1.62|   -1.61| 1.52|
| 1996| 4.05| 36462.12|  5607.19|   -1.44| -1.42|   -1.41| 0.78|
| 1997| 4.16| 19360.99|  9113.14|   -1.65| -1.63|   -1.62| 1.58|
| 1998| 4.06| 22801.53|  6927.12|   -1.57| -1.55|   -1.54| 1.22|
| 1999| 4.06| 36462.12|  7199.60|   -1.60| -1.58|   -1.57| 1.33|
| 2000| 4.06| 22801.53| 10816.80|   -1.61| -1.60|   -1.59| 1.41|
| 2001| 4.05| 18824.87|  9058.45|   -1.46| -1.45|   -1.44| 0.85|
| 2002| 4.06| 20464.76|  6766.12|   -1.44| -1.42|   -1.41| 0.79|
| 2003| 4.05| 21611.31|  5614.19|   -1.45| -1.44|   -1.42| 0.82|
| 2004| 4.05| 30911.23|  5445.26|   -1.52| -1.50|   -1.49| 1.02|
| 2005| 4.05| 19360.99|  4189.79|   -1.39| -1.38|   -1.36| 0.67|
| 2006| 4.05| 21032.63|  7864.87|   -1.63| -1.62|   -1.60| 1.48|
| 2007| 4.06| 18299.10|  5278.55|   -1.52| -1.50|   -1.49| 1.03|
| 2008| 4.06| 28722.51|  6516.37|   -1.57| -1.56|   -1.54| 1.23|
| 2009| 4.05| 24036.35|  5628.83|   -1.59| -1.58|   -1.56| 1.30|
| 2010| 4.06| 23293.68|  8049.43|   -1.54| -1.53|   -1.52| 1.13|
| 2011| 4.06| 26723.46| 11259.91|   -1.49| -1.48|   -1.47| 0.94|
| 2012| 4.06| 14899.37|  7670.79|   -1.53| -1.51|   -1.50| 1.07|
| 2013| 4.08| 25974.17|  5736.94|   -1.57| -1.56|   -1.54| 1.23|
| 2014| 4.06| 30025.12|  8122.67|   -1.63| -1.62|   -1.61| 1.48|
| 2015| 4.08| 18362.51|  9072.14|   -1.76| -1.75|   -1.73| 2.14|
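The constant C normalises the bounded power-law density C x^b so that it integrates to 1 between xmin and xmax. A minimal sketch to check this numerically, using a hypothetical helper plb_constant_sketch (not a package function) and toy values of b, xmin and xmax:

```r
# Normalisation constant for the density C * x^b on [xmin, xmax].
# Hypothetical helper that mirrors the formula used for the C column above.
plb_constant_sketch <- function(b, xmin, xmax) {
  if (b == -1) {
    1 / (log(xmax) - log(xmin))
  } else {
    (b + 1) / (xmax^(b + 1) - xmin^(b + 1))
  }
}

# Toy values, not taken from the table
C1 <- plb_constant_sketch(b = -1.5, xmin = 1, xmax = 100)
area1 <- integrate(function(x) C1 * x^(-1.5), lower = 1, upper = 100)$value

# Special case b = -1
C2 <- plb_constant_sketch(b = -1, xmin = 1, xmax = 100)
area2 <- integrate(function(x) C2 * x^(-1), lower = 1, upper = 100)$value

c(C1 = C1, area1 = area1, C2 = C2, area2 = area2)
```

Both areas should equal 1 up to the numerical tolerance of integrate().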

This is saved as a data set in the package (the code is not run here in the vignette):

usethis::use_data(MLEbins.res, overwrite = TRUE)